Beyond the Dataset

Deep dive into the Big Five Personality Test and General Applications/Trends in Society

NOTICE: This section is currently a work-in-progress. I’ll be continuously adding various bits and pieces, and you’ll absolutely encounter awkward sentences + spelling/grammar mistakes. Welcome to the slow process of analysis, and enjoy your stay (while it lasts)!

Definitions

When discussing the Big Five, there are several bits of terminology that needs to be understood:

Trait: “Person’s typical style of thinking, feeling, and acting in different kinds of situations and at different times” (Costa and McCrae 1988). Traits can be used to predict certain behaviors.

Personality: “Pattern of relatively permanent traits and unique characteristics that give both consistency and individuality to a person’s behavior.” They’re acquired through life experiences and are relatively stable.

Temperament: “Physical, mental, and emotional traits people are born with.”

Many consider temperament as a subset of personality. However, temperament often has their own model, with different traits. This model is called the CBQ (Children’s Behavior Questionnaire) and the three traits are:

Notice the overlap between the two - we’re only missing openness and agreeableness. However, despite the similarities, it should be acknowledged that the Big Five aims to cover personality traits, while integrating temperament into the mix.

While we’re talking about definitions, I should also clarify extroversion. The Big Five uses Eysenck’s concept of extroversion rather than Jung’s.

Eysenck: Extroverts gain and recharge their mental energy from external stimuli, such as social interaction. Introverts prefer to shield themselves from external stimuli and recharge their mental energy through withdrawing themselves.

Jung: Extroverts seek action and sensory input from the external world, using their experiences to influence themselves. Introverts immerse themselves in their internal environment, through reflection, dreaming, and understanding [themselves].

This clarification may be redundant – many people have only heard of Eysenck’s version of extroversion, as society tends to use his version as the ‘only’ version. However, I still find value in acknowledging Jung’s definition, especially because Eynsenck built his off of Jung’s.

Big Five Personality Test - Origins

“Can you find a taxonomy to describe human personality?”

That’s the idea that started the Big Five Personality Test. We wanted to understand human personality and map it out qualitatively. We needed to use the power of science to obtain even more knowledge!

Scientists have been interested in personality since the 1880s, with the first official study of personality done by Gordon Allport and Henry Odbert in 1936. First, they gathered 18000 personality-describing words from the Webster’s Dictionary. Then, using this list, found 4504 adjectives to describe non-physical characters, creating the first ‘personality wordbank’.

Then, in 1943, Raymond Cattel reduced this wordbank to ~160 traits, removing any words with similar meanings. He proceeded to add 22 more words to describe “interest” and “abilities”, then created “personality cultures” to group these words. By 1948, he narrowed it down to 36 terms with 12 personality cultures.

In 1947, Eysenck introduced his book “Dimensions of Personality”, creating his own version of “Extraversion” and coining the term “Neuroticism”.

In 1949, Donald Fiske takes 22 terms from Cattell’s study, and creates give main categories: “Social Adaptability”, “Emotional Control”, “Conformity”, “Inquiring Intellect”, and “Confident Self-expression”. This is the beginning of the Big Five test.

Although this test has undergone several revisions and expansions from various other researchers (i.e. Norman (1967), Smith (1967), McCrae and Costa (1987), and more), the purpose and ideas within the test remain.

Strengths

There’s a lot of reasons why the Big Five is such a popular test and remains at the forefront of personality research.

  1. The traits the Big Five measures are relatively stable during adulthood.

As we grow older, our traits are going to naturally develop and change. For example, as you get older, you tend to have higher conscientiousness and agreeableness and lower neuroticism, extroversion, and openness. This is mainly because people tend to have more responsibilities (i.e. family, job, etc.) as they age, and eventually learn to build traits to adapt. This is called the maturation effect.

However, after adolescence, people’s traits tend to stabilize, ensuring there are distinct behavior patterns to analyze. This consistency (leading to replicability) is desirable for scientists.

  1. Widely accepted by the scientific community

Many scientists have conducted studies to confirm the validity and reliability of the Big Five Personality Test. This includes Satow, 2021 and Kamarulzaman and Nordin, 2012). Using confirmatory factor analysis, invariance analysis, and various other empirical tests, the Big Five test is considered to be validated and reliable, and is now the most widely accepted test within this niche.

  1. Very easily repeatable

Scientists value repeatability. Got a fascinating result? Make sure to do the experiment 2 more times, just in case something was off!

The Big Five test is easily repeatable, as it’s generally delivered as a 50-question Likert scale test. It can easily be printed and distributed to thousands of participants, ensuring consistency between studies.

  1. Simplicity and Generalization

As a result of constantly improving the Big Five personality test, the traits we’ve selected are:

  1. Easy to understand and specific - Traits are clearly defined and simple
  2. Mutually exclusive - Makes it easy to quantify
  3. Generalizable - These traits are universal and can apply to any culture

Together, these strengths make the Big Five the default when considering personality.

Applications in Society

You can apply the Big Five personality traits to predict future behavior.

A popular example is that high conscientiousness and emotional stability (low neuroticism) are correlated with strong job performance and higher wages. This information could potentially shape businesses’ hiring and recruiting practices, as they’re already begun to use the Myers–Briggs Type Indicator (MBTI) to better understand candidates and narrow down the applicant pool.

This has the potential to be a positive addition: Bad hires can cost “up to 30% of the employee’s first-year earnings” - a significant loss for the business if “46% of newly-hired employees [are] deemed failures”. However, since the Big Five personality test is a self-assessment, lying on it becomes a trivial task.

High conscientiousness is also a general indicator for superior academic performance. This is likely because conscientiousness and agreeableness were “positively related with all four learning styles (synthesis analysis, methodical study, fact retention, and elaborative processing)”. It was also concluded that the Big Five “explained 14% of the variance in [a student’s] grade point average” whereas “learning systems explained an additional 3%”, establishing a clear connection between personality traits and learning styles improving one’s academics.

Certain traits can also correlate with specific learning styles. In a study, there were four types of learning styles analyzed:

Meaning Directed: Focuses on self-regulated learning and critical information synthesis, developing independent thinking and establishing connections between ideas and concepts.

Reproduction Directed: Focuses on memorizing and rehearsing information, directed by an external figurehead (teacher). Typically, the information is memorized to be reproduced/recalled on a test.

Application Directed: Focuses on understanding and learning to meet certifications of accreditations, focus on real-world application and examples.

Undirected: Enjoys cooperative learning, directed by an external figurehead, has difficulty finding ways to approach their studying and self-regulation.

These learning styles are apart of Vermunt’s Inventory of Learning and has been validated through several studies. However, it has fallen out of favor with the public since the introduction of the VARK (visual, aural, read/write, and kinesthetic).1

The correlations are as shown in the table below, with a ‘+’ representing a positive correlation, a ‘-’ representing a negative correlation. Otherwise, a blank cell means that there is no correlation between the two variables.

This connection could potentially allow teachers to better help struggling students, using different tactics to best assist them. For example, neurotic and undirected students could benefit from more direction on the task, asking other students to form groups and help teach each other the concepts, or giving them additional assistance while others continue other work. This ‘other work’ could also be personalized to better suit a student’s preference. Teachers could opt for more independent assignments for conscientious students, or give more thorough instructions to students who are more agreeable.

There are also various patterns with Big Five traits and health related information.

First, conscientiousness is the strongest predictor for reduced morality — being a conscientious person makes you 30% less likely to die, compared to other people. This is because these people tend to make better health choices, as they’re more likely to stay fit, cooperate when given medical advice, and have better sleeping habits, while also being less likely to smoke. Conscientiousness is also tied to having more supportive relationships, thriving at work, and having better stress management.

People with Alzhimers typically have a massive decrease in conscientiousness and large increase in neuroticism. Decreases in extraversion, openness, and agreeableness were also spotted. Hence, these Big Five traits could assist an early diagnosis for the patient.

Heroin and Ecstasy users also showed patterns when testing their Big Five traits. Heroin users had high neuroticism and openness, while Ecstasy had high extraversion and openness. Both had lower agreeableness and conscientiousness. Mental disorders are also linked to neuroticism (especially high neuroticism during adolescence).

Limitations of the Big Five Test

Lexical Hypothesis Despite the Big Five’s numerous applications in society, rigorous tests for reliability and validity, and general acceptance within the scientific community, there are various criticisms of the test.

One of the biggest criticisms of the test is the basis of lexical hypothesis. Lexical hypothesis assumes that “the most important personality traits are encoded as words…. and that the analysis of [these words] may lead to a scientifically acceptable personality model”. Hence, by using language (or for the Big Five, a dictionary) as a resource, researchers can create a list of important personality traits to judge people off of.

People are against the lexical hypothesis for various reasons:

Verbal descriptors result in social bias (i.e. people consciously or unconsciously choose traits that suit what society deems as ‘good’. For example, people might say they’re ‘agreeable’ when they’re not because society favors agreeable people more). This can greatly skew the data, with traits such as Extroversion, Agreeableness, and Neuroticism becoming invalid as researchers can not ensure that people will answer honestly and objectively.2

Others also deem personality as ‘too complex’ to be encoded into a singular word or used in everyday language. We may need far more detail to fully express what characteristics we want to capture, as language may not be an adequate vessel to describe the human condition and personality. Traits can also be too ambiguous – they could be misunderstood or their meaning could change overtime.

The lexical hypothesis also has little scientific backing. The terms used with lexical hypotheses were developed for daily use, and could represent different meanings/perceptions rather than the definition researchers aim to use. Furthermore, tests to validate the lexical hypothesis are deemed ‘unscientific’.

(In)correct Approaches There are two approaches for the Big Five: Etic: “Traits are universal, regardless of the environment, culture or context” Emic:”Traits are culture and context-specific”

Often, Big Five studies use the Etic approach, generalizing the results to encompass the entirety of the human population, rather than specificity.

The Etic (or universal) approach was led by Eysenck, as he tried to emphasize how these traits were more ‘biological’ and universal to every human language, regardless of one’s culture or environment. It is used to describe a generalized personality, then expand these personality traits to other species.

The Emic (or local) approach believes that the cultures heavily shape personality traits. For example, with the test’s origins stemming from lexical hypothesis, the current model proposes a Western-centric view of personality traits, rather than a holistic view.

Thus, the question becomes: Do you value a generalized and universal model to encompass the entire human condition? Or, do you take a more personalized approach, envisioning how one’s environment can influence their behavior?

Culture and Sex Differences Korean, Mexican, Indian, Filipino and Arabic cultures have reported that there are many special traits that only exist within their culture – something that can’t be generalized and would destroy our current Etic approach.

This is further explored by using lexical hypothesis in Chinese, replicating the origins of the Big Five test in Western countries. By utilizing the lexical hypothesis, researchers created a completely different quiz with different traits, suited to a Chinese audience.

In fact, let’s look at the Chinese Personality Assessment Inventory (CPAI), containing four major personality factors:

  1. Social potency (leadership, divergent thinking, novelty, extraversion)
  2. Dependability (responsibility, optimism vs pessimism, face3, locus of control)
  3. Accommodation (Family orientation, self vs society, defensiveness, graciousness, logical vs affective orientation)
  4. Interpersonal relatedness (individual vs social, harmony, discipline, traditionalism vs modernity)

There is some overlap with the Big Five.

Although there is overlap, an interesting idea becomes clear: The Chinese do not value the same things as Western audiences, due to cultural differences. For example, there are many openness-related traits, such as divergent thinking and novelty within the social potency category. However, it’s more so imbued within another category, functioning “in a complex context than a distinct construct” compared to the Western environment. Furthermore, the CPAI also contains extroversion alongside openness. This combination of traits could be a “more interpretable construct” within the Chinese taxonomy of personality, when they tend to be far more distinct and separated within a Western audience.

In addition, the definition of openness is quite varied and can completely change depending on one’s culture. Openness requires breaking free of tradition and embracing new ideas. They must be motivated by aesthetic sensitivity rather than rules — an idea which can be discouraged in Asian cultures. It also begs the question: Is their opinion truly what they think, or were they just taught to think that? If they’re taught to think that, does that matter? Is that their personality?

The issues of an Etic approach are also found in Arabic language and culture when attempting to create a more indigenous test, where differences in results were prominent enough to birth a completely new sixth trait: religiosity.

The Big Five Personality Test results also seem to change depending on one’s sex. When collecting results in Turkey, men were “more imaginative and inquisitive” compared to women. This is likely because of the patriarchal culture within Turkey, as they have very strict gender roles; men were able to pursue what they wanted while women were shoehorned into rigid and rule-abiding lifestyles. So, men would likely cultivate openness as it was encouraged by society, while women would stray away from it.

This also happens on a smaller scale with Western countries through the stereotypical socialization of girls and boys when they grow up. In fact, these differences in personality traits can be seen in early childhood.

In Canada, China, Finland, Germany, Poland, and Russia, men tend to be “more assertive and risk-taking than women”, hence scoring high on extrovertism and openness, whereas women were more “anxious and tender-minded”, scoring higher in scored higher in neuroticism, agreeableness, and conscientiousness” compared to men.

When analyzing results from 55 nations, the trend changes slightly: women scored higher in neuroticism, agreeableness, extroversion, and conscientiousness. However, men scored higher in assertiveness and openness. Socioeconomic factors also influenced gender differences, where the largest differences measured were larger in healthy, rich and gender-egalitarian cultures. Furthermore, men in developed world regions were “less neurotic, extraverted, conscientious and agreeable, compared to men in less developed world regions”. Women did not have their personality differ based on their socioeconomic status.

Of course, there are various reasons for these differences in personality based on sex, such as social roles and evolutionary reasons. For example, due to parental investment, men are more open to risks and tend to be more socially dominant, while women tend to be more nurturing and cautious. However, with these results, can you measure and generalize certain traits such as openness and extraversion accurately when it is actively being manipulated by cultural, societal, and evolutionary factors?

Lack of Representation Although this test samples hundreds of locations, it’s not completely representative of the population it claims to show. This is because a majority of our data comes from Western, educated, industrialized, rich and democratic (WEIRD) populations. Less developed countries (such as the ones within Africa) are often glossed over and represent a tiny amount of data points.

A clear example of this is the data set that I’ve been analyzing. When we look at the data points on the map (to find trends between countries), it’s clear that much of Africa has been omitted from the data.4 It brings the reliability of the data set into question: can we rely on WEIRD populations to make generalizations, and assume it will be correct? Are these results universally applicable?

For the Big Five, studies have shown that non-WEIRD populations do not follow the same trends as WEIRD populations. In fact, for low and middle income countries, Big Five personality questions fail to measure the personality traits they’re built for, and have low validity. This is a huge contrast to the high validity seen in internet surveys from the same low/middle income countries.

Previously, it has been mentioned that conscientiousness and emotional stability are more correlated with income. It also intuitively makes sense — if you can work hard and have good stress management, workers could perform their job better. Yet, when analyzing non-WEIRD populations, it was seen that conscientiousness was not a predictor for increased income. Instead, only emotional stability and openness were predictors. However, openness items “poorly differentiate from other items… one cannot exclude that this result is driven by systematic responses biases” and other external factors. One must remain cautious between correlation and causation.

During the process of surveying non-WEIRD populations, it should also be noted that errors may occur due to the quality of translation (i.e. wording of question, how they are interpreted within a specific culture) via an enumerator, as there was effort to adapt to respondents with lower cognitive ability. Thus, the impact of an enumerator’s face-to-face interactions with respondents should be taken into consideration. It was also noted that less educated populations had lower interest in taking the survey, as their incentives and expectations for the survey may be completely different than a respondent in a WEIRD country who is taking the Big Five test for fun.

Inconclusive Studies Assuming that we continue to use an ‘etic’ or an universal approach, various studies have shown differing results, decreasing the reliability of the Big Five as a universal measuring tool of personality.

For example,

Under the ‘Etic’ or universal approach, it’s difficult to justify these differences.

Variability of Tests


  1. The VARK model also has many critiques for various reasons.↩︎

  2. You can see how this has affected this data set, most notably with agreeableness and openness.↩︎

  3. The public image you project - your ‘face’ in society.↩︎

  4. Recall that I removed any countries that had less than 10 responses to reduce skewing and better visualize the data.↩︎